# Zero-shot Generalization

VisualClozePipeline 384

VisualCloze is a universal image generation framework based on visual in-context learning. It generalizes across multiple in-domain tasks as well as unseen tasks, and can generate the target image and intermediate results in a single step.

PoseLess is an innovative robotic hand control framework that directly maps 2D images to joint angles using projection representations, eliminating the need for explicit pose estimation.

Multimodal Fusion

Poseless-3B is a vision-language model (VLM)-based robotic hand control framework that directly maps 2D images to joint angles without explicit pose estimation.

Pose Estimation

ColQwen2.5 v0.1

A visual retrieval model based on Qwen2.5-VL-3B-Instruct and the ColBERT strategy. It generates multi-vector representations of text and images to enable efficient document retrieval.

Safetensors English

A visual retrieval model based on Qwen2-VL-2B-Instruct and the ColBERT strategy. It indexes documents efficiently from their visual features.

Text-to-Image English

SAM2 Hiera Large

A foundation model for promptable visual segmentation in images and videos, developed by FAIR.

Image Segmentation

OpenVLA 7B is an open-source vision-language-action model trained on the Open X-Embodiment dataset, capable of generating robot actions based on language instructions and camera images.

Transformers English

OpenVLA v0.1 7B is an open-source vision-language-action model trained on the Open X-Embodiment dataset. It supports controlling a variety of robots.

Transformers English

BiomedNLP KRISSBERT PubMed UMLS EL

KRISSBERT is a knowledge-enhanced self-supervised learning model for biomedical entity linking. It trains contextual encoders using unannotated text and domain knowledge to effectively address the diversity and ambiguity of entity names.

Knowledge Graph

Transformers English

A text-to-SQL model fine-tuned from the T5-3B architecture. It uses PICARD constrained decoding to improve the accuracy of the structured queries it generates.

Large Language Model

Transformers English


© 2025 AIbase